Fast Penalized Regression and Cross Validation for Tall Data with the <b>oem</b> Package

Abstract

A large body of research has focused on theory and computation for variable selection techniques for high dimensional data. There has been substantially less work in the big "tall" data paradigm, where the number of variables may be large, but the number of observations is much larger. The orthogonalizing expectation maximization (OEM) algorithm is one approach for computation of penalized models which excels in the tall data regime. The oem package is an efficient implementation of the OEM algorithm which provides a multitude of computation routines with a focus on big tall data, such as a function for out-of-memory computation of large-scale parallel penalized regression models. Furthermore, in this paper we propose a specialized implementation of the OEM algorithm for cross validation, dramatically reducing the computing time for cross validation over a naive implementation.
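
The cross validation speedup described above can be illustrated outside of oem. For quadratic penalties, a fit depends on the data only through the sufficient statistics X'X and X'y, and each held-out fold's training statistics are just the totals minus that fold's contribution, so K-fold CV over many penalty values needs only one pass over the tall data. The sketch below uses the ridge penalty as a stand-in (oem handles a broader penalty family); the function name and structure are illustrative, not oem's API.

```python
import numpy as np

def ridge_cv_tall(X, y, lams, k=5, seed=0):
    """K-fold ridge CV that reuses per-fold Gram matrices.

    Each fold's X'X and X'y are accumulated once; the training-set
    statistics for fold j are the totals minus fold j's contribution,
    so no pass over the data is repeated for different lambdas.
    """
    n, p = X.shape
    rng = np.random.default_rng(seed)
    fold = rng.integers(0, k, size=n)
    # One pass over the data: per-fold sufficient statistics.
    XtX = np.stack([X[fold == j].T @ X[fold == j] for j in range(k)])
    Xty = np.stack([X[fold == j].T @ y[fold == j] for j in range(k)])
    XtX_tot, Xty_tot = XtX.sum(axis=0), Xty.sum(axis=0)
    errs = np.zeros(len(lams))
    for j in range(k):
        # Training statistics for fold j: totals minus the fold's share.
        A, b = XtX_tot - XtX[j], Xty_tot - Xty[j]
        Xv, yv = X[fold == j], y[fold == j]
        for i, lam in enumerate(lams):
            beta = np.linalg.solve(A + lam * np.eye(p), b)
            errs[i] += np.sum((yv - Xv @ beta) ** 2)
    return errs / n  # mean cross-validated squared error per lambda
```

Because the p-by-p systems are solved from cached statistics, the cost of trying additional lambda values is independent of n, which is exactly what makes this attractive when observations vastly outnumber variables.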

Similar Articles

Modified Cross-Validation for Penalized High-Dimensional Linear Regression Models

In this article, for Lasso penalized linear regression models in high-dimensional settings, we propose a modified cross-validation (CV) method for selecting the penalty parameter. The methodology is extended to other penalties, such as Elastic Net. We conduct extensive simulation studies and real data analysis to compare the performance of the modified CV method with other methods. It is shown ...

Simple one-pass algorithm for penalized linear regression with cross-validation on MapReduce

In this paper, we propose a one-pass algorithm on MapReduce for penalized linear regression fλ(α, β) = ‖Y − α·1 − Xβ‖²₂ + pλ(β), where α is the intercept, which can be omitted depending on the application; β is the coefficient vector and pλ is the penalty function with penalizing parameter λ. fλ(α, β) includes interesting classes such as Lasso, Ridge regression and Elastic-net. Compared to latest iterativ...
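
The one-pass structure this abstract refers to can be sketched as follows: each mapper reduces its chunk of rows to the sufficient statistics X'X and X'y, the reducer sums them, and every λ on the path is then solved without touching the data again. This is a plain-Python stand-in for the MapReduce setting, with the ridge penalty as the concrete pλ (the one case with a closed-form solve); the function names are illustrative, not from the paper.

```python
import numpy as np

def map_stats(X_chunk, y_chunk):
    """Map step: reduce a chunk of rows to its sufficient statistics."""
    return X_chunk.T @ X_chunk, X_chunk.T @ y_chunk

def reduce_stats(stats):
    """Reduce step: sufficient statistics simply add across chunks."""
    XtX = sum(s[0] for s in stats)
    Xty = sum(s[1] for s in stats)
    return XtX, Xty

def ridge_path(XtX, Xty, lams):
    """Solve the ridge instance of f_lambda for each lambda using only
    the reduced statistics (intercept omitted, as the abstract allows)."""
    p = XtX.shape[0]
    return [np.linalg.solve(XtX + lam * np.eye(p), Xty) for lam in lams]
```

Since X'X is p-by-p, the reduce and solve steps stay cheap no matter how many rows the mappers scanned, which is the essence of the one-pass claim.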

Bootstrap Enhanced Penalized Regression for Variable Selection with Neuroimaging Data

Recent advances in fMRI research highlight the use of multivariate methods for examining whole-brain connectivity. Complementary data-driven methods are needed for determining the subset of predictors related to individual differences. Although commonly used for this purpose, ordinary least squares (OLS) regression may not be ideal due to multi-collinearity and over-fitting issues. Penalized re...

Fast function-on-scalar regression with penalized basis expansions.

Regression models for functional responses and scalar predictors are often fitted by means of basis functions, with quadratic roughness penalties applied to avoid overfitting. The fitting approach described by Ramsay and Silverman in the 1990s amounts to a penalized ordinary least squares (P-OLS) estimator of the coefficient functions. We recast this estimator as a generalized ridge regression...

Penalized Regression with Ordinal Predictors

Ordered categorical predictors are a common case in regression modeling. In contrast to the case of ordinal response variables, ordinal predictors have been largely neglected in the literature. In this article, penalized regression techniques are proposed. Based on dummy coding, two types of penalization are explicitly developed; the first imposes a difference penalty, the second is a ridge type r...

Journal

Journal: Journal of Statistical Software

Year: 2022

ISSN: 1548-7660

DOI: https://doi.org/10.18637/jss.v104.i06